Surfactants or scaffolds? RNAs of varying lengths control the thermodynamic stability of condensates differently

Biomolecular condensates, thought to form via liquid-liquid phase separation of intracellular mixtures, are multicomponent systems that can include diverse types of proteins and RNAs. RNA is a critical modulator of RNA–protein condensate stability, as it induces an RNA concentration-dependent reentrant phase transition—increasing stability at low RNA concentrations and decreasing it at high concentrations. Beyond concentration, RNAs inside condensates can be heterogeneous in length, sequence, and structure. Here, we use multiscale simulations to understand how different RNA parameters interact with one another to modulate the properties of RNA–protein condensates. To do so, we perform residue/nucleotide resolution coarse-grained molecular dynamics simulations of multicomponent RNA–protein condensates containing RNAs of different lengths and concentrations, and either FUS or PR25 proteins. Our simulations reveal that RNA length regulates the reentrant phase behavior of RNA–protein condensates: increasing RNA length sensitively rises the maximum value that the critical temperature of the mixture reaches, and the maximum concentration of RNA that the condensate can incorporate before beginning to become unstable. Strikingly, RNAs of different lengths are organized heterogeneously inside condensates, which allows them to enhance condensate stability via two distinct mechanisms: shorter RNA chains accumulate at the condensate’s surface acting as natural biomolecular surfactants, while longer RNA chains concentrate inside the core to saturate their bonds and enhance the density of molecular connections in the condensate. Using a patchy particle model, we additionally demonstrate that the combined impact of RNA length and concentration on condensate properties is dictated by the valency, binding affinity, and polymer length of the various biomolecules involved. Our results postulate that diversity on RNA parameters within condensates allows RNAs to increase condensate stability by fulfilling two different criteria: maximizing enthalpic gain and minimizing interfacial free energy; hence, RNA diversity should be considered when assessing the impact of RNA on biomolecular condensates regulation.

For our simulations of fused in sarcoma (FUS) (see sequence below) and PR 25 with polyU RNA, we employ the high-resolution sequence-dependent coarse-grained model Mpipi [1], which describes almost quantitatively the temperature-dependent LLPS phase behaviour of different protein condensates such as that of FUS.In addition, this model correctly predicts the multiphase behaviour of the polyR/polyK/polyU system, and recapitulates experimental LLPS trends for sequence mutations on FUS, DDX4 NTD and LAF-1 RRG domain variants [1].Within this force field, electrostatic interactions are modelled through a Coulombic term with Debye-Huckel electrostatic screening [2], given by the sum over all particle-particle (i, j) interactions as: where q is the charge (being -0.75e for the different nucleotides: A, C, G, U; and +0.75e for amino acids such as R and K, +0.375e for H, and -0.75e for D and E residues), ϵ r = 80 is the relative dielectric constant of water, ϵ 0 is the electric constant, κ −1 = 795 pm is the Debye screening length, and r ij is the distance separating particles i and j.For these interactions, a Coulomb cut-off of 3.5 nm is employed.The rest of non-bonded interactions (i.e., hydrophobic, cation-π or π-π) between distinct protein/RNA beads are modelled via the Wang-Frenkel potential [3]: where representing σ ij the molecular diameter of each residue/nucleotide and ϵ ij the interaction strength between distinct amino acids and nucleotides (i and j).While σ ij , ϵ ij and µ ij are parameters specified for each pair of interactions (see Ref. [1]), ν ij and R ij are constant model parameters set to ν ij = 1 and R ij = 3σ ij for every interaction.Finally, bond energy is computed with an harmonic bond potential of the following form: where b is the total number of bonds, r b is the bond distance, k=8.03 Jmol −1 pm −2 is the spring constant and r 0 is the bond reference position, set to 381 pm and 500 pm for protein and RNA bonds respectively.For further details on this force field and the full list of the model parameters please see Ref. [1].The following Protein Data Bank (PDB) codes were used to build the globular structured domains of FUS (residues from 285-371 (PDB code: 2LCW) and from 422-453 (PDB code: 6G99)).

C. Colloidal patchy particle model for protein/RNA phase-separation
For the minimal coarse-grained simulations shown in Fig. 4 of the main text, we employ a patchy particle model [4][5][6][7] in which proteins are described by a pseudo hard-sphere (PHS) potential [8] that accounts for their excluded volume: where λ a =49 and λ r =50 are the exponents of the attractive and repulsive terms respectively, and ε R accounts for the energy shift of the pseudo hard-sphere interaction.On top of this, we add a continuous square-well (CSW) potential for modeling the different protein binding sites, therefore mimicking protein multivalency: where ϵ CSW is the depth of the potential energy well, r w the radius of the attractive well, and α controls the steepness of the well.We choose α = 0.005σ and r w = 0.12σ so that each binding site can only interact with another single one.RNA-protein interactions are modeled with a standard Lennard-Jones (LJ) potential [9]: where ϵ LJ measures the depth of potential and σ the excluded volume between proteins and RNA.The Lennard-Jones potential is employed between protein cores (not binding sites) and RNA beads, while RNA-RNA interactions are modelled via the PHS potential [8], with a repulsive Yukawa potential on top of the PHS one: where we set A at 0.42 kcal• Å / mol and κ at 2.57 Å−1 .In this way, we model RNA as a self-repulsive polymer of bonded repulsive spheres, and protein-RNA interactions via sites that are not the protein-protein binding sites.Hence, if one protein is bonded to RNA, it can still bind to other proteins.In our model, the mass of each patch is 5% of the central PHS particle mass, which is set to 3.32x10 −26 kg, despite this choice being irrelevant for equilibrium simulations.This 5% ratio fixes the moment of inertia of the patchy particles (our minimal coarse-grained proteins).The molecular diameter of the proteins, both scaffold and cognate proteins, as well as the RNA beads is σ = 0.3405 nm, and the value of ε R /k B is 119.81K.With this model, we express magnitudes in reduced units: reduced temperature is defined as T * = k B T /ϵ CSW , reduced density as ρ * = (N/V )σ 3 , reduced pressure as p * = pσ 3 /(k B T ), and reduced time as σ 2 m/(k B T ).In order to keep the PHS interaction as similar as possible to a pure HS interaction, we fix k B T /ε R at a value of 1.5 as suggested in Ref. [8] (fixing T = 179.71K).We then control the effective strength of the binding protein attraction by varying ϵ CSW such that the reduced temperature, T * = k B T /ϵ CSW , is of the order of O(0.1).The cut-off distance for the interactions in this model are 1.17σ for both PHS and CSW potentials and 5σ for the LJ interactions.The ϵ LJ /k B for LJ interactions is set to 152.5K.This model has been proven to qualitatively reproduce the effect of protein valency in LLPS [5], the enhancement of RNA-mediated LLPS in RNA-binding proteins [10] or the multilayer organization [6] and condensate size conservation in scaffold-client mixtures [7].

D. Simulation details
Our Direct Coexistence simulations [11,12] are performed in the NVT ensemble (i.e.constant number of particles (N), volume (V) and temperature (T)), for which we use a Nosé-Hoover thermostat [13,14] with a relaxation time of 5 ps for the Mpipi model simulations and 0.074 in reduced units for the patchy particle simulations.Since all our potentials are continuous and differentiable, we perform all our simulations using the LAMMPS Molecular Dynamics package [15].Periodic boundary conditions are used in the three directions of space.The timestep chosen for the Verlet integration of the equations of motion is 10 fs for the Mpipi model and 3.7x10 −4 in reduced time units for the patchy particle model.For the bulk simulations with the Mpipi model, we employ the N pT ensemble, controlling the pressure (p) with the Nosé-Hoover barostat [16] using a relaxation time of 10 ps.All our simulations were equilibrated for at least 50 ns, ensuring that quantities such as the potential energy of the system and the condensate density remain constant over time.Production runs generally last from 200 ns up to few microseconds depending on the system and the property of interest, being more computationally demanding those simulations aiming to elucidate the condensate architecture (as shown in Figs. 3 (main text) and S1).
PR 25 -polyU simulations were performed with 96 repeats of the protein and varying amounts of RNA, ranging from 1600 to 4800 nucleotides in total, and split into different chains to achieve the desired mass ratio and polyU strand length.For FUS-polyU simulations, we make use of 48 protein replicas and 400 to 1600 nucleotides in total forming RNA chains of different lengths.In the minimal model simulations we made use of 1000 coarse-grained proteins along with 100 to 600 RNA beads in total split into chains of different lengths as indicated in the main text.

SII. COMPUTING PHASE DIAGRAMS VIA DIRECT COEXISTENCE
To calculate the coexisting densities of the phase diagrams shown in Fig. 1b and 4b-c of the main text, we employ the Direct Coexistence method [11,12,17].Within this scheme, the two coexisting phases are simulated by preparing periodically extended slabs of the two phases, the condensed and the diluted one, in the same simulation box.We use an implicit solvent model; accordingly, the diluted phase (protein-poor liquid phase) and the condensed phase (protein-rich liquid phase) are effectively a vapour and a liquid phase, respectively.Once our DC simulations have reached equilibrium, we compute the density profile along the long axis of the box, and thus, we extract the density of the two coexisting phases (as shown in Fig. 1(b) and 3(a-e) of the main text and the Supporting Material of Ref. [10]).From the plateau of the condensed phase and the diluted one, we measure the density (avoiding the interfaces between both phases).To estimate the critical point of the phase diagrams, we use the universal scaling law of coexistence densities near a critical point [18], and the law of rectilinear diameters [19]: and where ρ l and ρ v refer to the coexisting densities of the condensed and diluted phases respectively, while ρ c is the critical density, T c is the critical temperature, and d and s 2 are fitting parameters.On the other hand, the methodology to estimate critical temperatures via NpT bulk simulations is provided in Section II of the main text, and illustrated in Fig. 1b.

SIII. COMPUTING THE INTERFACIAL FREE ENERGY FROM DIRECT COEXISTENCE SIMULATIONS
From Direct Coexistence simulations, we can obtain the interfacial free energy (γ) of the condensates by using the following expression: where L x corresponds to the long side of the simulation box (perpendicular to the slab interfaces), N is the number of droplets in the system, and p N and p T are the normal and tangential components of the pressure tensor with respect to the interfaces of the system (note that the tangential component must be averaged over the two tangential directions).

SIV. SIMULATION CONVERGENCE
To ensure that our simulations reach convergence, we estimate from our bulk N pT simulations (presented in Fig. 1 and 2 of the main text) the approximate relaxation timescale for the systems to equilibrate.This can be easily approximated from the time that it takes the centre-of-mass particle of the largest biomolecule in the system to diffuse at least 1 or 2 times its average (protein or RNA) radius of gyration (as discussed in Refs.[20][21][22]).Once simulated the timescale needed to diffuse such distance (typically between 3 to 5 microseconds), we compute density profiles of the system over time, and we average the profiles from independent blocks to determine the equilibrium condensate structural arrangement (as shown in Fig. 3).Importantly, we note that the impact of a modest equilibration in systems employed to determine the critical temperature can be less dramatic than that required for characterizing the condensate architecture as shown in Fig. 3 of the main text, and Fig S1.

SV. EFFECT OF TEMPERATURE ON THE RNA-PROTEIN CONDENSATE STRUCTURAL ARRANGEMENT
In Fig. 3 of the main text, we present density profiles for FUS-RNA and PR 25 -RNA condensates at T=0.98T c .In order to assess the effect of temperature on the observed condensate architecture, in Fig. S1 we show density profiles for the same condensates but at T/T c 0.95 and 0.85 to demonstrate the mild dependence of condensate organization on temperature.We observe that there is no significant variation with temperature in the local arrangement of proteins and RNA along the condensate, with the density profiles presented in Fig. S1 being very similar to those of Fig. 3 in the main text.

SVI. PROTEIN-RNA CONTACT FREQUENCY MAPS
As mentioned in the main text, the Mpipi model, despite being a residue-resolution coarse-grained model, it is able to qualitatively recapitulate the different contact frequencies that distinct protein domains can establish with RNA [1].Here, following the methodology described in Refs.[21][22][23], we evaluate the contact frequency between polyU RNA strands and the different protein amino acids of both FUS and PR 25 within the condensates.In Figure S2, we show the contact map frequency of the different residues in both FUS-RNA and PR 25 -RNA condensates for the heterotypic protein-RNA interactions.This calculation has been performed at T=0.95T c and in the presence of both long (400-nt) and short (50-nt) RNA chains for both PR 25 -RNA and FUS-RNA systems (i.e., same conditions of the results shown in Fig. S1 for T=0.95T c ).For the case of the PR 25 -RNA system, the U/PR 25 mass ratio was 1.6, while for the FUS-RNA condensate it was U/FUS mass ratio = 0.14.While in PR 25 -RNA droplets protein-RNA   higher number of contacts.Calculations were performed at T=0.95T c (T c being the corresponding critical temperature of each system) and in the presence of both long (400-nt) and short (50-nt) RNA chains for both PR 25 -RNA and FUS-RNA systems (as studied in Fig. S1).For the case of the PR 25 -RNA system, we select a U/PR 25 mass ratio of 1.6, while for the FUS-RNA a U/FUS mass ratio of 0.14.
interactions are quite homogeneous due to the dipeptide repeat distribution of the PR 25 sequence, in FUS-RNA condensates, the domains which preferentially interact with polyU are the RNA-recognition motifs (from the 282nd to the 371st residue), and the three arginine-glycine rich regions (RGG1: 163 to 267; RGG2: 371 to 422; RGG3: 453 to 526), in qualitative agreement with experimental observations [24][25][26].

T
Figure S1: RNA-protein density profiles for the condensates shown in Fig. 3 of the main text and at the temperatures indicated in the different panels.

FUS-
RNA contact frequency map PR25-RNA contact frequency map

Figure S2 :
Figure S2: Contact probability per residue between polyU and the different amino acids composing FUS-RNA (Top panel) and PR 25 -RNA (Bottom panel) condensates.The bar indicates the frequency of amino acid-U contacts (normalized by the averaged total number of protein-RNA contacts), with the darkest colours corresponding to ahigher number of contacts.Calculations were performed at T=0.95T c (T c being the corresponding critical temperature of each system) and in the presence of both long (400-nt) and short (50-nt) RNA chains for both PR 25 -RNA and FUS-RNA systems (as studied in Fig.S1).For the case of the PR 25 -RNA system, we select a U/PR 25 mass ratio of 1.6, while for the FUS-RNA a U/FUS mass ratio of 0.14.
B. FUS Sequence and PDB of the structured domains